Foreword



This Technical Specification has been produced by the 3rd Generation Partnership Project (3GPP).

The contents of the present document are subject to continuing work within the TSG and may change following formal 
TSG approval. Should the TSG modify the contents of the present document, it will be re-released by the TSG with an 
identifying change of release date and an increase in version number as follows: 

Version x.y.z 

where: 

x the first digit:

1 presented to TSG for information; 

2 presented to TSG for approval; 

3 or greater indicates TSG approved document under change control. 

y the second digit is incremented for all changes of substance, i.e. technical enhancements, corrections, 
updates, etc. 

z the third digit is incremented when editorial only changes have been incorporated in the document. 



Introduction 



Dynamic and Interactive Multimedia Scenes (DIMS) is a dynamic, interactive, scene-based media system which
enables display and interactive control of multimedia data such as audio, video, graphics, images and text. It ranges
from a movie enriched with vector graphics overlays and interactivity (possibly enhanced with closed captions), to
complex multi-step services with fluid interaction and different media types at each step. Demand for
such rich media services is increasing rapidly, spurred by the development of the next generation mobile
infrastructure and the generalization of TV content to new mobile environments.

In the case of a video portal application, subscribers can watch TV, video and audio enriched with additional data
(graphics, text, images) in streaming, online, progressive download or offline mode. DIMS provides a convenient and
natural way to browse rich-media services: web-like access (content available in less than three clicks, easy discovery,
no learning curve), permanent refresh of content through dynamic updates available on the fly, and decreased latency,
since data can be visualized as soon as possible.

Content can be synchronized with up to frame accuracy (e.g. to assure content providers and operators that voting
will start and stop at a precise time within an interactive show, or to allow karaoke text flows).
Scope



DIMS defines a dynamic rich-media system, including a media type, its transport, packaging, delivery, and
interaction with the local terminal, user, and other local and remote sub-systems. Enhanced end-user experiences are 
provided by the coordinated management and synchronization of media and events, combined with end-user interaction. 

The DIMS media type can be used as a generic media type, allowing the creation of dynamic interactive rich-media
services, and can also benefit from, or be used in association with, other media types (e.g. audio codecs, video codecs,
XHTML browser, etc.).
Definitions and abbreviations



3.1 Definitions 

For the purposes of the present document, the following terms and definitions apply: 

DIMS Scene: SVG scene, which may include extensions, and may be updated over time 

DIMS Unit: basic unit of transport, processing, and compression, of DIMS content 

New Scene: complete scene (containing an "svg" element), suitable for starting a session or completely replacing the
current scene in a session
(Functions very similarly to an I-frame in video)

Normal DIMS Unit: DIMS Units processed when processing a stream (cf. Redundant DIMS Unit)

Primary Stream: stream which defines the complete scene tree, i.e. in which all random access points are, or build, a 
complete DIMS Scene 

Redundant DIMS Unit: DIMS Units which supply a redundant 'summary' of the stream, and which can be used for 
random access, tune-in, or error recovery (cf. Normal DIMS Unit) 

Scene Update: set of differences that make changes to the scene in the current session 
(Similar to a P-frame in video) 
Secondary Stream: stream which manages only a portion of the scene tree



3.2 Abbreviations



For the purposes of the present document, the following abbreviations apply: 

API Application Program Interface 

AVP Audio/Video Profile 

CTR CounTeR 

DIMS Dynamic and Interactive Multimedia Scenes

DOM Document Object Model 

FEC Forward Error Correction 

FLUTE File deLivery over Unidirectional Transport 

HTTP Hyper Text Transfer Protocol 

IANA Internet Assigned Numbers Authority

ID IDentifier 

LASeR Lightweight Application Scene Representation 

MIME Multipurpose Internet Mail Extensions 

MMS Multimedia Messaging Service 

MT Media Time 

MTU Maximum Transmission Unit 

PSS Packet-switched Streaming Service

RAP Random Access Point 

RTP Real-Time transport Protocol 

RTSP Real Time Streaming Protocol 

RU Recovery Unit 

SDP Session Description Protocol 

SI Switch Intra (I) frame 

SMIL Synchronized Multimedia Integration Language 

SU Scene Update 

SVG Scalable Vector Graphics 

TCP Transmission Control Protocol 

UAProf User Agent Profile

uDOM microDOM 

UDP User Datagram Protocol 

UE User Equipment 

URL Uniform Resource Locator 

URN Uniform Resource Name 

W3C World Wide Web Consortium 

XHTML eXtensible HyperText Markup Language

XML eXtensible Markup Language
Overview and architecture

Figure 4-1: General architecture of the rich media system

The rich media system can be viewed as a client-server architecture comprising three main components: the rich media
server, the transport mechanisms and the rich media client. Figure 4-1 illustrates the general architecture. The server
takes as input rich media content comprising scene description, discrete media (e.g. images) and continuous media
(e.g. audio, video). The scene description can be dynamically updated through scene updates. The rich media content
can be encapsulated into a container format containing additional information such as media synchronization,
metadata, and hint tracks for packetization. The system then uses various transport mechanisms, with 1-to-1 and
1-to-many protocols, for download, progressive download and streaming scenarios. The content is played on the client,
allowing local and remote interactivity through feedback and data requests.



Media-type definition

5.1 Introduction



The DIMS media type allows spatial and temporal layout of the multimedia scene. This scene can consist of any 
combination of still pictures, video, audio, and animated graphics. It includes an update mechanism that allows for 
partial updates of the existing scene, as well as updating the presentation with a completely new scene and streaming 
tune-in functionality. 

5.2 Media type components

The DIMS media type consists of: 

• Base scene description, which is SVG Tiny 1.2 [1]. 

• Scene description extensions. 

• Scene commands. 

• Event generation and processing. 
5.3 Namespace

The namespace called DIMS here is associated with the URN "http://www.3gpp.org/richmedia/".

5.4 Scene description 

5.4.1 Base Scene Description 

SVG Tiny 1.2 provides the basic DIMS Scene functionality; layout, inclusion and referencing of objects, 
synchronization of object timelines and a rendering model. 

The full syntax and semantics of SVG Tiny 1.2 shall be supported for DIMS Scene functionality. The version and 
baseProfile attributes of the SVG element document the version and profile of SVG on which this scene is based. 

5.4.2 Scene Description Extensions 

5.4.2.1 Introduction 

Extensions defined here are designed so that: 

a) When the same functionality is present in profiles of SVG other than SVG Tiny 1.2, then the extension is
compatible with that or a restricted version of that.

b) A terminal implementing both the present document and SVG (any version) can use a common implementation 
of the DOM tree, scene graph, rendering model etc. without having variant handling that depends on whether the 
scene was built using DIMS or SVG. 

c) No extensions are required to be present in all documents; content authored to the SVG Tiny 1.2 specification 
may be used as the initial scene of a stream designed to the present document. 

The following extensions are defined here. 

5.4.2.2 Rectangular clipping of a graphical object 

The lsr:rectClip mechanism provides pixel aligned clipping defined as a transformable rectangle. 

The lsr:rectClip element shall be supported. lsr:rectClip is defined in subclause 6.8.28 of [3].

5.4.2.3 Full-screen video 

The full-screen video feature consists of the attribute lsr:fullscreen on the SVG video element. 

The lsr:fullscreen attribute shall be supported; it is defined in subclause 6.8.40.2 of [3].

See clause 10 for security considerations of fullscreen. 

5.4.2.4 Full-screen SVG 

The full-screen SVG feature in the DIMS namespace consists of an attribute 'fullscreen' on the <svg> element to hint
that the scene should be rendered on the full screen. The possible values are "true" and "false" (default). With the
attribute set to "true", the DIMS UE should negotiate the rendering area with its parent UE and obtain as large a part
of the screen as possible for the DIMS canvas.

See clause 10 for security considerations of fullscreen. 

5.4.2.5 Attributes clipBegin and clipEnd 

Attributes clipBegin and clipEnd defined in subclause 7.6.1 of [5] shall be supported on the following elements: video, 
audio, animation, and the "updates" element as described in subclause 5.4.2.6. 
5.4.2.6 Update Streams

The present document defines a new element 'updates' in the DIMS namespace to link secondary streams of updates to a 
scene. This element has an implicit "simple duration" of 'indefinite'. The synchronization attributes defined in [1] 
subclause 12.6 can be used with this element. 

NOTE: lsr:updateSource defined in subclause 6.8.59 of [3] is a superset of this element. 

Attribute definitions: 

All timing attributes defined in [1] subclause 16.2.7 are defined for this element, except the "fill" attribute. 

The attributes clipBegin and clipEnd defined in subclause 5.4.2.5, and syncReference defined in subclause 5.4.2.7, are
defined for this element.

xlink:href= "<iri>" 

An IRI reference to an update document or a DIMS stream/file, this attribute specifies the location of the stream 
of updates. In the absence of this attribute, this element does not have any effect. This attribute is not animatable 
and not inheritable. 

5.4.2.7 Synchronization of Media Streams 

lsr:syncReference = "<iri>" 

The elements video, audio, and animation from SVG, and 'updates' from the present document, may have the attribute
lsr:syncReference from subclause 6.8.8.2 of [3], with the associated synchronization behaviour. This attribute holds a
reference to the stream or media element whose clock acts as a clock reference for the stream referred to by this
element. This attribute is not animatable and not inheritable.

5.4.2.8 Screen orientation 

Two events and two feature strings are defined that make it possible for scenes to adapt to the screen layout. The events 
are defined in subclause 6.1.3. 

Whenever the terminal detects a change of orientation, angle, or screen size, one of these two events is dispatched. A 
portrait event is dispatched if the screen is taller than it is wide, and a landscape event is dispatched if the screen is 
wider than it is tall. It is the responsibility of the system below the scene to orient the screen buffer to user; the DIMS 
Scene author does not do this. 

The angle between the long (primary) axis of the screen and vertical is reported in degrees in screenAngle, to the best of
the terminal's capability. This angle is measured clockwise from vertical (see figure 5-1) and would normally be close to
0 or 180 in portrait events, and close to 90 or 270 in landscape events.
Figure 5-1: Screen Orientation

These events have the following interface. 

interface ScreenOrientationEvent : Event
{
    readonly attribute unsigned long screenWidth;
    readonly attribute unsigned long screenHeight;
    readonly attribute unsigned long screenAngle;
};

screenWidth - contains the new screen display or viewport width. 

screenHeight - contains the new screen display or viewport height.

screenAngle - documents the angle between the primary axis of the screen, and vertical. 

The screen orientation events shall be supported in DIMS. If the terminal has an orientation sensor, or other physical 
adaptation that causes the available screen drawing area to change (e.g. a partial cover), events shall be generated 
whenever the terminal detects a change in any of the parameters to these events. These events may be used in the 
following circumstances: 

1) To register event listeners based on the screen orientation events so that the script can be invoked when the event 
occurs. This can be done either through the application using uDOM APIs or declaratively via the <ev:listener> 
element with <ev:event> attribute set to one of the screen orientation events and invoking the appropriate 
<handler> element. 

2) Timed Elements that can be defined to begin or end based on screen orientation events. 

The following feature strings shall also be supported, in order to allow the use of the switch element: 

• orientLandscape for typical 'landscape' orientation; 

• orientPortrait for typical 'portrait' orientation. 

The namespace of these feature strings is ffs. 

If the most recent event generated was a portrait event, then the portrait feature tests as true; if the most recent event 
was a landscape event, the landscape feature tests as true. At any time, exactly one of these features shall test as true. 
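
The dispatch rule and the feature-string tests above can be sketched as follows. This is a minimal illustration only: the event names "portrait" and "landscape", the Python class and the function names are assumptions made for the sketch (the normative event definitions are in subclause 6.1.3), and the treatment of a square screen is not specified by the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScreenOrientationEvent:
    """Mirrors the three read-only attributes of the interface above."""
    kind: str           # "portrait" or "landscape" (hypothetical event names)
    screen_width: int
    screen_height: int
    screen_angle: int   # degrees, measured clockwise from vertical


def make_orientation_event(width: int, height: int, angle: int) -> ScreenOrientationEvent:
    # Wider than tall gives a landscape event; taller than wide gives a portrait event.
    # Assumption: a square screen, which the text does not cover, is treated as portrait.
    kind = "landscape" if width > height else "portrait"
    return ScreenOrientationEvent(kind, width, height, angle % 360)


def feature_tests(last_event: ScreenOrientationEvent) -> dict:
    # Exactly one of the two feature strings tests as true at any time,
    # depending on the most recent event generated.
    return {
        "orientPortrait": last_event.kind == "portrait",
        "orientLandscape": last_event.kind == "landscape",
    }
```

For instance, a terminal reporting a 320x240 drawing area at an angle of 90 degrees would produce a landscape event, after which only orientLandscape tests as true.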

The referencing of key mapping from OMA RME in the DIMS specification is ffs. 
5.4.2.9 Current-Time Indication

In a redundant Random Access Point of a Primary Stream, there is a need to establish the current SceneTime of the
scene, so that terminals tuning in, performing random access, or recovering from a lost high-priority DIMS Unit achieve
the same SceneTime as terminals which have processed the entire stream from the most recent non-redundant Random
Access Point. This media time (scene time) is indicated by the current_scene_time attribute on the SVG element, and
takes a valid clock value in the document timeline, from the SVG specification [1].

The scene state is set exactly as if the SVG document had been loaded and displayed at the non-zero time T in the 
current-time indicator. 

EXAMPLE: This is the same as if this SVG scene had been used in a SMIL document as the target of an
"animation" element with clipBegin of T, or if conceptually all absolute times S in the document
were replaced with S-T and the document instantiated at time 0.

Attribute definition: 

current_scene_time = "<clock-value>" 

Specifies the current scene time (a valid clock value) in the document timeline, at which the scene is displayed. The
scene state is set exactly as if the SVG document had been loaded and displayed at the non-zero time T in the
current-time indicator. This attribute is defined in the DIMS namespace, and may be present on the root SVG element in
redundant random access points. The default value is zero.
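
The conceptual equivalence described for current_scene_time (replace every absolute time S with S-T and instantiate the document at time 0) can be checked with a small sketch. The function names and the use of plain seconds instead of parsed clock values are assumptions made for illustration.

```python
def rebase_times(absolute_times, current_scene_time):
    """Replace each absolute document time S with S - T."""
    return [s - current_scene_time for s in absolute_times]


def is_active(begin, end, scene_time):
    """True while a timed element with interval [begin, end) is running."""
    return begin <= scene_time < end


# An element running from 5 s to 12 s, tuned into at T = 8 s.
T = 8.0
begin, end = rebase_times([5.0, 12.0], T)

# Loading the original scene at time T and loading the rebased
# scene at time 0 give the same element state.
same_state = is_active(5.0, 12.0, T) == is_active(begin, end, 0.0)
```

A terminal tuning in at the redundant random access point therefore sees the element already three seconds into its active interval, as a terminal that processed the whole stream would.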

5.4.2.10 Active attribute 

On all SVG elements, the following optional attribute is defined in the DIMS namespace: 

active: this attribute defines whether the element is active. The possible values are "true" (default) and "false". 

Setting the value of this attribute to true or false is equivalent to executing the commands activate and deactivate. See 
subclause 5.5.3 for the behaviour of deactivated elements. 

5.5 Scene Commands 
5.5.1 Scene Updates 

The scene update mechanism allows reception of updates that change parts of the current scene, without having to 
replace the entire scene. 

To account for the different update scenarios two update mechanisms are defined: 

• Primary-stream updates: Updates are delivered to the client in the same stream as the original scene. 

• Secondary-stream updates: Updates are delivered to the client in separate streams from the original scene, 
e.g. in an interactive scenario or initiated from the scene mark-up. 

In a primary-stream case, the updates and/or scene replacements are sent in the same stream as the initial scene. The 
temporal management of samples in a primary stream is based upon transport level timestamps. A secondary stream is a 
stream that does not contain the initial scene. A secondary stream is initiated directly from the DIMS mark-up using the 
'updates' element. 

The following LASeR commands from subclause 6.7 of [3] in LASeR ML format shall be supported. 

• The LASeR Insert command from subclause 6.7.5 of [3] shall be supported on elements, attributes and values in 
list attributes with the following relaxed constraints: values may be inserted on attributes x and y of the text 
element. 

• The LASeR Delete command from subclause 6.7.4 of [3] shall be supported on elements, attributes and values in 
list attributes. 
• The LASeR Replace command from subclause 6.7.8 of [3] shall be supported on elements, attributes and values
in list attributes with the following relaxed constraints: attributes attributeName, id, type, xml:space, 
preserveAspectRatio and the x and y attributes of the text element can be replaced. There are no restrictions on 
the value of attributeName. The text regarding executionTime does not apply. 

• The LASeR Add command from subclause 6.7.2 of [3] shall be supported. 

5.5.2 State management commands 

The following state management commands shall be supported: 

• The LASeR Save command from subclause 6.7.10 of [3] shall be supported. 

• The LASeR Restore command from subclause 6.7.9 of [3] shall be supported. 

• The LASeR Clean command from subclause 6.7.3 of [3] shall be supported. 

These LASeR commands are defined as an interface to persistent storage. Selected scene information is cached on a
best-effort basis. The security principles behind this caching closely follow the state caching mechanism in HTTP,
commonly called cookies [23].

The saved data is defined by a save ID and scoped by the "service"; for a command to operate on the data, both must 
match. 

The LASeR command "save" saves the values of a set of attributes, each identified by element ID and attribute name.
Each save operation uses a save ID. 

The LASeR command "restore" restores the attributes (if any) previously saved and scoped by the domain-name and 
path. The set of data restored is defined by the save ID. 

The LASeR command "clean" erases the storing area for a particular save ID. The element information stored in the 
corresponding memory area is not available anymore. 
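
The scoping rule above (saved data is addressed by both the "service" and the save ID) can be sketched as a small in-memory store. This is an illustrative model, not the LASeR storage interface: the class name, the method signatures and the representation of the scene as a dictionary of element IDs to attribute dictionaries are all assumptions.

```python
class SceneStateStore:
    """Toy model of the persistent storage behind save, restore and clean."""

    def __init__(self):
        # (service, save_id) -> {(element_id, attribute_name): value}
        self._areas = {}

    def save(self, service, save_id, scene, targets):
        """Save the listed (element ID, attribute name) values under one save ID."""
        self._areas[(service, save_id)] = {
            (element_id, name): scene[element_id][name]
            for element_id, name in targets
            if element_id in scene and name in scene[element_id]
        }

    def restore(self, service, save_id, scene):
        """Write back previously saved values; returns how many were restored."""
        restored = 0
        for (element_id, name), value in self._areas.get((service, save_id), {}).items():
            if element_id in scene:
                scene[element_id][name] = value
                restored += 1
        return restored

    def clean(self, service, save_id):
        """Erase the storing area for one save ID within one service."""
        self._areas.pop((service, save_id), None)
```

Because the key is the pair of service and save ID, a restore issued from a different service finds nothing even when the save ID matches, which mirrors the cookie-like scoping described above.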

The following two attributes are defined in the stream signalling, and define the security restrictions for the above 
commands: 



• useFullRequestHost: this Boolean attribute indicates whether the full domain name of the request-host is used
(1) or the first component of the domain name is elided (0). For example, if the source material came from
"www.example.org", then this differentiates between associating the "service" with "www.example.org" and
".example.org". (Note the definition of local names in the RFC, and the possibility to associate the "service" with
locally loaded files, and that the domain name may be either "<hostname>.local" or ".local" in that case.)
Together with pathComponents, this attribute defines the "service".

• pathComponents: this integer attribute indicates how much of the source path is used. If this takes the value 0, 
then the "service" is not associated with a path, and if it takes the special value 15 (or any value equal to or 
greater than the number of components in the path) then the entire path is used up to but excluding the final file- 
name. For example, if the source was "/user/laser-expert/demo/art.mp4" then a value of 4 or greater selects 
"/user/laser-expert/demo/art.mp4" as the path, the value 2 selects "/user/laser-expert" and the value zero sets no 
path. Together with useFullRequestHost, this attribute defines the "service". 

The above security attributes are ffs. 

5.5.3 Activate and Deactivate 

The commands activate and deactivate as defined in subclauses 6.7.12 and 6.7.13 of [3] shall be supported, in a manner 
that is functionally equivalent to that specification. These commands have one attribute: 

• ref: the id of the element which is to be activated or de-activated. 

NOTE: When an element is deactivated, the system treats the DOM tree as if that element and its
descendants were not present in the DOM tree; they are invisible to everything except commands and scripts.
Commands and scripts can reference the element as if it were still in the DOM tree. When activated, the element is
restored to visibility, in the same location in the tree as if it had not been previously deactivated.
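
The behaviour described in the NOTE can be modelled with two traversals over the same tree: one used by commands and scripts, which still sees deactivated elements, and one used by everything else, which skips them. The class and function names below are assumptions made for this sketch and are not taken from [3].

```python
class Node:
    def __init__(self, node_id, children=()):
        self.id = node_id
        self.active = True
        self.children = list(children)


def find(node, node_id):
    """Lookup used by commands and scripts: deactivated elements remain reachable."""
    if node.id == node_id:
        return node
    for child in node.children:
        hit = find(child, node_id)
        if hit is not None:
            return hit
    return None


def visible_ids(node):
    """Traversal used by everything else: a deactivated subtree is skipped entirely."""
    if not node.active:
        return []
    ids = [node.id]
    for child in node.children:
        ids.extend(visible_ids(child))
    return ids


def set_active(root, ref, active):
    """Model of the activate (True) and deactivate (False) commands on element 'ref'."""
    target = find(root, ref)
    if target is not None:
        target.active = active
```

Since the element never leaves the tree, re-activating it restores it at its original position without any bookkeeping.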
5.5.4 Distributed Random Access Points

5.5.4.1 Introduction



A Distributed Random Access Point (DRAP) is a redundant DIMS tune-in point (either primary or secondary) that can,
instead of explicitly defining all elements itself, reference elements in upcoming DIMS units. The commands in these
following DIMS units are not executed; elements are simply copied according to the references in the DRAP. These
references can be used to reduce redundancy (i.e. not defining an element both in a RAP and in an update) or simply to
spread the size of the RAP over a period of time.

After this copying operation, the pending action(s) in the DRAP are complete, and are then executed, and normal 
processing resumes. 
Figure 5-2: Illustration of the DRAP Concept

5.5.4.2 DRAP syntax and semantics



The rootmost element in a DRAP document shall be a <drap> element in the DIMS namespace. A DIMS Unit 
containing a DRAP shall contain only the DRAP. 

Attribute definitions: 

unitsrequired="units-required"

Indicates the number of coming DIMS units required. 

NOTE 1: These DIMS units are not executed in the normal way when tuning in using DRAP; instead, they are used
as needed as a source of material for the DRAP. 

The DRAP element contains one or more getfromupdate elements, which form the processing instructions, and one or 
more other elements that form the pending action(s). The processing instructions are applied to the pending action. The 
indicated number of DIMS units are processed for the DRAP, and the pending action(s) are performed at the time of the 
DIMS unit at the indicated distance, and normal DIMS unit processing resumes. 

NOTE 2: All the getfromupdates should have been resolved by the indicated distance. 



3GPP TS 26.142 version 7.0.0 Release 7, ETSI TS 126 142 V7.0.0 (2007-06)

The getfromupdate element shall reference an element in another DIMS unit and an element in the pending actions. The 
element referred to in the other DIMS unit shall replace the element in the DRAP pending actions. 

Attribute definitions: 

source="elementid"

Specifies an xml id appearing in an upcoming DIMS unit. If the same xml id appears in different DIMS units, it 
shall not make a difference which one the client chooses. 

target="elementid"

Specifies an xml id appearing in the DRAP pending action(s). 
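
The copying step performed for each getfromupdate element can be sketched with the Python standard library XML parser. Several simplifications are assumed here and are not taken from the present document: namespaces are omitted, the xml id is read from a plain "id" attribute, each upcoming DIMS unit is passed as a single-rooted XML string, and the command and attribute names used in the usage example are placeholders.

```python
import copy
import xml.etree.ElementTree as ET


def _find_by_id(roots, element_id):
    """Return the first element carrying the given id in any of the parsed units."""
    for root in roots:
        for element in root.iter():
            if element.get("id") == element_id:
                return element
    return None


def _replace_by_id(root, element_id, replacement):
    """Replace the descendant of root carrying the given id; True on success."""
    for parent in list(root.iter()):
        for index, child in enumerate(list(parent)):
            if child.get("id") == element_id:
                parent[index] = replacement
                return True
    return False


def resolve_drap(drap_xml, upcoming_units_xml):
    """Return the pending action(s) of a DRAP with all getfromupdate references resolved."""
    drap = ET.fromstring(drap_xml)
    required = int(drap.get("unitsrequired"))
    # The following units are only a source of material; their commands are not executed.
    units = [ET.fromstring(text) for text in upcoming_units_xml[:required]]
    holder = ET.Element("pending")  # scratch parent so top-level targets can be replaced
    holder.extend([child for child in drap if child.tag != "getfromupdate"])
    for instruction in drap.findall("getfromupdate"):
        source = _find_by_id(units, instruction.get("source"))
        if source is None or not _replace_by_id(
                holder, instruction.get("target"), copy.deepcopy(source)):
            raise ValueError("unresolved getfromupdate: %r" % instruction.attrib)
    return list(holder)
```

A usage example under the same assumptions: a DRAP whose initial scene contains a placeholder group with id "slot", filled from an image with id "logo" carried in the next unit.

```python
drap = ('<drap unitsrequired="1">'
        '<getfromupdate source="logo" target="slot"/>'
        '<svg><g id="slot"/></svg>'
        '</drap>')
unit = '<Insert ref="root"><image id="logo" href="logo.png"/></Insert>'
pending = resolve_drap(drap, [unit])
```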



5.5.5 Immediate Script Execution 



The doScript command in the DIMS namespace shall be supported. This command supplies a script for immediate 
execution, including the ability to update the DOM. It has a single attribute, the type of the script. The script is in the 
body of the element. Processing this command involves executing the script in the context of the DIMS stream in which 
it occurs. 

Attributes: 

type - is a string that identifies the scripting language used. It takes a suitable MIME type [18] from the IANA
registry, such as "application/ecmascript" (see [13]). 

An example is: 

<doScript type="application/ecmascript">
  // Create a hidden group and append it to the root of the scene.
  var root = document.getDocumentElement();
  var myGroup = document.createElementNS(
      "http://www.w3.org/2000/svg", "g");
  myGroup.setId("myGroup");
  myGroup.setTrait("visibility", "hidden");
  root.appendChild(myGroup);
  // Create a rectangle, fill it with red and set its x position.
  var myRect = document.createElementNS(
      "http://www.w3.org/2000/svg", "rect");
  myRect.setId("myRect");
  var color = root.createRGBColor(255, 0, 0);
  myRect.setRGBColorTrait("fill", color);
  myRect.setFloatTrait("x", 10);
</doScript>

5.5.6 Seeking in the DIMS Stream

<seek seekOffset="seekOffset" />

Attributes:

seekOffset: A clock value from subclause 16.2.7 of [1].

The command seek in the DIMS namespace results in a seek, by the amount seekOffset, in the DIMS stream timeline. 
The target stream time is obtained by adding seekOffset to the current stream time. As a DIMS stream may contain 
multiple scenes, this seeking can result in a change of scene. The (possibly new) scene shall also be seeked to the local 
time corresponding to the seeked global time. 

NOTE: Seeking can be conceptually seen as a function where the global timeline and document timelines are 
moving forward in a synchronized manner, just as in normal playback, but more quickly and without 
rendering. A seek backwards in time (negative seekOffset) could be done in a similar way, but by starting 
again from zero and moving forward. 
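
The arithmetic of the seek command, together with the conceptual model in the NOTE, can be sketched as follows. Times are plain seconds rather than parsed clock values, the function name is hypothetical, and rejecting a target before the start of the stream is an assumption, since the text does not say how that case is handled.

```python
def plan_seek(current_stream_time, seek_offset):
    """Return (start_time, target_time) for a seek in the DIMS stream timeline.

    The target is the current stream time plus seekOffset. A forward seek
    fast-forwards without rendering from the current time; a backward seek
    conceptually restarts from zero and moves forward to the target.
    """
    target = current_stream_time + seek_offset
    if target < 0:
        # Assumption: a target before the start of the stream is treated as an error.
        raise ValueError("seek target lies before the start of the stream")
    start = current_stream_time if seek_offset >= 0 else 0.0
    return start, target
```

For example, plan_seek(10.0, 5.0) fast-forwards from 10 s to 15 s, while plan_seek(10.0, -4.0) replays from 0 s up to 6 s.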
5.6 DIMS Unit Definition

5.6.1 Definition 

A DIMS Unit is built from a header and a body. The DIMS Unit Body is either: 

a. a complete SVG document as specified in subclause 5.4, possibly using extensions; or 

b. a textually concatenated sequence of scene commands as specified in subclause 5.5.

A DIMS Unit Body may be compressed. 

DIMS Units are framed by the transport layer. Each DIMS Unit has certain characteristics, signalled by the DIMS Unit 
Header. 

There are DIMS Units used in redundant processing, and DIMS Units used in normal processing. Redundant DIMS 
Units, and DIMS Units marked as random-access points, are used in random access, tune-in, and error recovery; for a 
full description of their processing model, see subclause 5.8. 

5.6.2 DIMS Unit Header

The DIMS Unit Header is 1 byte long. The length of a DIMS Unit is the length of the DIMS Unit Body plus the length of
the DIMS Unit Header. DIMS Unit lengths are carried by the transport layer.

The DIMS Unit Header has the following layout. 

+-+-+-+-+-+-+-+-+
|0|1|2|3|4|5|6|7|
+-+-+-+-+-+-+-+-+
| X |C|P|D|I|M|S|
+-+-+-+-+-+-+-+-+

Figure 5-3: DIMS Unit header 

These fields have the following definitions: 

S: is-Scene: when 1, indicates that the DIMS Unit contains a Scene Description as documented in
subclause 5.4; when 0, indicates that the DIMS Unit contains one or more Scene Commands as
documented in subclause 5.5.

M: is-RAP: when 1, indicates a Random Access Point; when 0, indicates a non-random-access point.

I: is-redundant: when 0, indicates a main (normal processing) DIMS Unit; when 1, signals a redundant DIMS
Unit.

D: redundant-exit: shall be 0 on DIMS Units with is-redundant==0; on DIMS Units with is-redundant==1, when
1, indicates that redundant processing is completed by this DIMS Unit, and normal processing
should begin, and when 0, indicates that redundant processing should continue.

P: priority: when set to 1, indicates a high-priority unit; when set to 0, indicates a low-priority unit. A unit should
be marked as low-priority if both of the following hold when this DIMS Unit is lost or not
processed by the terminal, and shall be marked as high-priority otherwise:

1. all succeeding DIMS Units can be decoded and operated on without error (e.g. their DOM
updates do not depend on the possibly lost command(s));

2. the visual and semantic nature of the scene is satisfactory to the content author.

DIMS Units with is-redundant set to 1 should normally be marked as low-priority, to avoid
their loss causing an un-needed entry into tune-in state when redundant and normal data are
carried in the same transport.
C: compression: indicates the compression applied;

0 indicates no compression (textual format);

1 indicates that the content is compressed using the encoding signalled in stream setup.

X: reserved: shall be set to 0 and should be ignored.

NOTE: The setting of the priority field is, due to point 2 above, partly at the discretion of the content creator. An 
example of a simple method of evaluating point 2 is to see if, when the next packet is received, the 
terminal state is identical to what it would have been if the DIMS Unit(s) had not been lost in the first 
place. 
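
Decoding the one-byte header can be sketched as below. The bit masks rest on an assumption the text does not state: bit 0 of the figure is taken to be the most significant bit, so that S is the least significant bit and the reserved X field covers the leading bits, which are ignored. The class and function names are hypothetical.

```python
from typing import NamedTuple, Tuple


class DimsUnitHeader(NamedTuple):
    compressed: bool       # C
    high_priority: bool    # P
    redundant_exit: bool   # D
    is_redundant: bool     # I
    is_rap: bool           # M
    is_scene: bool         # S


def parse_dims_unit_header(octet: int) -> DimsUnitHeader:
    """Decode the header flags; reserved X bits are ignored, as the text recommends."""
    if not 0 <= octet <= 0xFF:
        raise ValueError("the DIMS Unit Header is a single byte")
    return DimsUnitHeader(
        compressed=bool(octet & 0x20),
        high_priority=bool(octet & 0x10),
        redundant_exit=bool(octet & 0x08),
        is_redundant=bool(octet & 0x04),
        is_rap=bool(octet & 0x02),
        is_scene=bool(octet & 0x01),
    )


def split_dims_unit(unit: bytes) -> Tuple[DimsUnitHeader, bytes]:
    """Split a DIMS Unit, as framed by the transport layer, into header and body."""
    if not unit:
        raise ValueError("a DIMS Unit contains at least the one-byte header")
    return parse_dims_unit_header(unit[0]), unit[1:]
```

Under these assumptions the byte 0x13 describes an uncompressed, high-priority, non-redundant random access point carrying a complete scene.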



5.7 Timing model 



The SVG timing model applies, with the addition of the processing of updates at their times. All events and "ready" 
updates are applied at their time. The relative timing of updates and events with the same activation times is not defined 
by the present document. 

DIMS inherits timed elements from SVG Tiny 1.2 and defines an additional one: the updates element, which supports 
the same timing and synchronization as the media elements. DIMS uses the run-time synchronization functionality that 
SVG [1] inherits from SMIL [5]. 

The scene time is set to zero when an initial scene is loaded. Scene time advances according to the SVG Tiny 1.2 timing 
model. 

Logically XML fragments are sent in DIMS units which have Media Time timestamps (MT). The timing model defines 
how to translate these into scene time. These media time stamps may not have a known origin, and are expressed on a 
timescale declared at the transport layer. Note that the equations below do not show the correction for timescale units, 
for simplicity. 

We define a NewScene DIMS unit as one DIMS unit containing an "svg" element. The media timestamp MT(ns) of that 
DIMS unit is arbitrary, but the defined SceneTime of it is zero; ST(ns) = 0. 

ST(AUwithNewScene) = 0

The XML fragment which supplies the construct "r" is sent in a later DIMS unit with media timestamp MT(r). The 
defined scene time of that DIMS unit is: 

If there was a NewScene in this stream: 

ST(r) = MT(r) - MT(AUofLastNewScene)

If there was no NewScene in this stream: 

ST(r) = MT(r) - MT(FirstAU) + streamOffset

streamOffset is determined by the syncBehavior attribute, just as for other media elements, as follows:

• The resolved value of the begin attribute of updates pulling that stream if syncBehavior="locked". 

• When syncBehavior is not locked, the streamOffset is equal to the scene time when the first AU of the secondary 
stream is applied, and may change during the stream playback (e.g. buffer underrun). 
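
The two scene-time equations above can be combined into one helper, shown here as a sketch. The function and parameter names are hypothetical. Unlike the equations in the text, the sketch also applies a timescale correction, under the assumption that media timestamps are in ticks of the transport timescale while streamOffset and the result are in seconds.

```python
def scene_time(mt, *, timescale=1, mt_last_new_scene=None,
               mt_first_au=None, stream_offset=0.0):
    """Translate a media timestamp MT(r) into scene time ST(r), in seconds."""
    if mt_last_new_scene is not None:
        # There was a NewScene in this stream: ST(r) = MT(r) - MT(AUofLastNewScene)
        return (mt - mt_last_new_scene) / timescale
    if mt_first_au is None:
        raise ValueError("need MT of the last NewScene or of the first AU")
    # No NewScene in this stream: ST(r) = MT(r) - MT(FirstAU) + streamOffset
    return (mt - mt_first_au) / timescale + stream_offset


# Primary stream on a 90 kHz timescale: an update half a second after the NewScene.
primary = scene_time(135000, timescale=90000, mt_last_new_scene=90000)

# Secondary stream on a millisecond timescale, pulled in at scene time 4 s.
secondary = scene_time(3000, timescale=1000, mt_first_au=1000, stream_offset=4.0)
```

The first call yields a scene time of 0.5 s and the second 6.0 s; the NewScene unit itself always maps to scene time zero.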

The processing model for scene updates is the same as for script and event processing. The present document does not
mandate any processing order for DIMS units, scripts or events that are to be processed at a single time instant. A
DIMS unit does, however, have exclusive access to the scene tree during processing. DIMS Units shall be processed in
decoding order, i.e. sequence number order in RTP or order inside a sample in the 3GP file format. DIMS Units with the
same timestamp in the same stream are applied 'instantaneously' in media time, that is, the media time does not change
while they are being applied.

This subclause is ffs. 

5.8 Processing Model 



A Scene Description is processed as a complete replacement for the current scene tree. That is, the entire DOM is 
discarded and replaced with the result of parsing the SVG element. All other DIMS Units retain (and possibly modify) 
the current scene tree. 

All high-priority data units not marked as redundant shall be processed during normal decoding. All low-priority data 
units not marked as redundant should be processed during normal decoding. Data units marked as redundant should be 
ignored during normal processing. All data units marked as RAP are suitable tune-in points. When tuning-in to a 
stream, decoding shall begin by, at the latest, the first unit marked as RAP irrespective of the value of its is-redundant 
flag. 

If a normal (non-redundant) random access point is identified during redundant processing or DRAP processing, the 
normal random access point should take precedence. 

Commands that cannot be executed (e.g. they refer to a DOM node which does not exist) shall be ignored when in the 
tune-in or redundant-processing state. This condition should not arise in normal processing, and the handling of such 
commands in the normal state is not defined by the present document. 

The following state diagram, and processing pseudo-code for each state, illustrate the states and the use of the various 
flags in the DIMS unit header, and comply with the mandatory and recommended processing requirements. DIMS clients 
should implement the state diagram and pseudo-code, or an equivalent or better scheme. 

In the state diagram and pseudo-code, the terminal may be processing a stream under one of three conditions: 

a) normal processing, 'normal'; 

b) after tuning in, performing random access, or when loss is detected, 'tune-in'; 

c) while processing redundant DIMS units, 'process-redundant'. 

Tune-in state is entered under any of the following circumstances: 

a) after opening a stream; 

b) after performing random access; 

c) after loss of a high-priority DIMS Unit in normal processing; 

d) after loss of any DIMS Unit in redundant processing. 

[State diagram not reproduced. States: 'normal', 'tune-in', 'process-redundant'. Transition labels recoverable from the 
figure: "any loss", "high-priority", "normal RAP or redundant-exit".] 

Figure 5-4: DIMS Client State Diagram 

The following behaviour is performed for each DIMS Unit in each state, and then a state transition is performed as 
indicated by the state diagram. 

Normal state: 

if DIMS-Unit.is-redundant 
then Discard(DIMS-Unit) 
else Process(DIMS-Unit); 

Tune-in state: 

if DIMS-Unit.is-RAP 
then Process(DIMS-Unit) 
else Discard(DIMS-Unit); 

Redundant-processing: 

if (DIMS-Unit.is-redundant) || 
   ((not DIMS-Unit.is-redundant) && DIMS-Unit.is-RAP) 
then Process(DIMS-Unit) 
else Discard(DIMS-Unit); 

Where this pseudo-code indicates that a DIMS Unit is processed, then if a Distributed Random Access Point (DRAP) is 
in process, elements required by the DRAP are extracted from this unit. If a DRAP is not in process, the DIMS Unit is 
processed as normal. If one of the DIMS units identified by units-required is a normal (non-redundant) random access 
point, DRAP processing should be abandoned, and that normal RAP processed in the usual way. 
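
As an illustration, the per-state process/discard decision in the pseudo-code can be written as one function. This is a minimal sketch: the `State` enumeration and the function name are hypothetical, and the state transitions themselves (which depend on the state diagram and on loss detection) and DRAP handling are deliberately left out.

```python
from enum import Enum


class State(Enum):
    NORMAL = "normal"
    TUNE_IN = "tune-in"
    PROCESS_REDUNDANT = "process-redundant"


def should_process(state: State, is_rap: bool, is_redundant: bool) -> bool:
    """Return True if a DIMS Unit is processed in this state, False if discarded."""
    if state is State.NORMAL:
        # Normal state: redundant units are discarded.
        return not is_redundant
    if state is State.TUNE_IN:
        # Tune-in state: only random access points are processed.
        return is_rap
    # Redundant processing: redundant units, plus any normal RAP.
    # (is_redundant or (not is_redundant and is_rap)) reduces to this.
    return is_redundant or is_rap
```

The last branch relies on the logical identity noted in the comment; it accepts exactly the same units as the pseudo-code for the redundant-processing state.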

5.9 Random Access, Tune-in and Error Recovery 

5.9.1 Introduction 

Random access points in streams are either essential random access points or redundant random access points. Essential 
random access points are processed by terminals in all states. Redundant random access points should be processed 
only by terminals needing to perform random access, tune-in, or error recovery. 

Random access points are indicated in the DIMS Unit header using the is-RAP flag. Redundant random access points 
have the is-redundant flag set to 1; essential random access points have this flag set to 0. 

5.9.2 Random Access Points in Primary Streams 

A Random Access Point (essential or redundant RAP) in a primary stream shall either contain an entire scene (i.e. be a 
Scene Description) or the mechanism to build an entire scene (such as DRAP). When used, this scene becomes the 
current scene and replaces all previous data. There may be further DIMS Units with the same timestamp that modify the 
scene tree. 

A redundant Random Access Point in a primary stream shall have the current_scene_time attribute on the SVG element. 
Any following commands in subsequent DIMS Units with the same timestamp are processed at this time. 

5.9.3 Random Access Points in Secondary Streams 

A Random Access Point (essential or redundant RAP) in a secondary stream shall either contain an entire update (i.e. a 
series of commands) or the mechanism to build an entire update (such as DRAP). The command(s) provided set the 
scene (specifically, the portion of the scene managed by the secondary stream) into an appropriate state, whether the 
random access point is used for initial tune-in, or for error recovery. 

NOTE: The secondary stream needs to be encoded in such a way that the appropriate state is set regardless of which 
packets were lost, and regardless of whether this is an initial tune-in or a random access. This would include 
removing any elements or attributes which should have been removed, etc. A simple way of encoding such a stream 
would be to only let updates in a secondary stream make modifications to a few nodes. Then this 
operation could be as simple as removing these few nodes and reinserting them, removing all potential 
errors. 

5.9.4 Error Recovery 

There are several error resilience mechanisms available in DIMS. Among these are: 

• Priority: By separating essential and non-essential units one can determine whether a loss needs repair or not. This 
is described in subclause 7.3.1. 

• Periodic Random Access Points (RAPs): Random Access Points can be placed periodically in a stream. In the 
case of error one can tune-in to the channel again. 

• Separation of static and dynamic data. This can even increase the efficiency of Distributed Random Access 
Points. 

A combination of these methods can be used. 


6 Interaction and Scripting 

6.1 Local interaction 



6.1.1 DOM Level 3 events 

The supported local events and their management in DIMS are built upon the events model described in [1]. 

They include DOM Events (focus, activate, etc.), SVG Events (connection, load, etc.) and general XML events [19] 
(user events, timing, key, and pointer events). 

6.1.2 Media Access Events 

The media access events defined in [20] shall be supported. 

6.1.3 Screen Orientation Events 

The following events shall be supported. 

Event Name                    Namespace  Description                     DOM Interface           Bubble  Cancel
----------------------------  ---------  ------------------------------  ----------------------  ------  ------
"screenOrientationPortrait"              The screen orientation has      ScreenOrientationEvent  No      No
                                         changed to typical 'portrait'
                                         orientation
"screenOrientationLandscape"             The screen orientation has      ScreenOrientationEvent  No      No
                                         changed to typical 'landscape'
                                         orientation

The namespace for these events is ffs. 

6.1.4 Other Events 

The events pausedevent and resumedevent from subclause 6.5.2 of [3] shall be supported. 

The following events, in the DIMS namespace, shall be supported. 

Name                Definition                                                          Bubbles  Cancellable
------------------  ------------------------------------------------------------------  -------  -----------
"activatedEvent"    Occurs when an element changes state from deactivated to activated  No       No
"deactivatedEvent"  Occurs when an element changes state from activated to deactivated  No       No



6.2 Remote interaction 

Client-server communication is possible in the DIMS system using three different mechanisms: 

• The client can open a suitable URL. The set of valid URL forms is not specified in DIMS, and may include, for 
example, protocols such as HTTP [12], RTSP [16] or MailTo. 

• By establishing a socket connection between the client and the server using the Connection API in the uDOM [17]. 

• By using the HTTP specific SVG uDOM methods getURL or postURL [17]. 


6.3 Scripting 



SVG Tiny 1.2 contains a uDOM interface that provides linkage to a script engine and adds the possibility to modify the 
DOM representation of the scene from scripts. 

ECMAScript mobile profile (MP) [2] can be used in conjunction with the script and handler elements and the SVG uDOM 
API (Appendix A of [1]) in order to provide more powerful DOM manipulation and interaction. 

UEs supporting the DIMS media type shall support ECMAScript mobile profile (MP) [2] with the following extensions 
to the uDOM API. 

Table 1 adds to the table in subclause A.8.12 of [1]. It contains trait access rules for DIMS extensions. 

Table 1: Trait access rules for DIMS extensions 



Attribute        Trait Getter              Trait Setter              Default Values  Description
---------------  ------------------------  ------------------------  --------------  ---------------------------------------------------
lsr:fullscreen   getTraitNS[true | false]  setTraitNS[true | false]  false           Available on <video> element
dims:fullscreen  getTraitNS[true | false]  setTraitNS[true | false]  false           Available on the <svg> element
lsr:x            getFloatTraitNS           setFloatTraitNS           0.0f            Origin x of the <rectClip>
lsr:y            getFloatTraitNS           setFloatTraitNS           0.0f            Origin y of the <rectClip>
lsr:width        getFloatTraitNS           setFloatTraitNS           0.0f            Width of the clipping region defined by <rectClip>
lsr:height       getFloatTraitNS           setFloatTraitNS           0.0f            Height of the clipping region defined by <rectClip>

Description of getFloatTraitNS and setFloatTraitNS methods: 

float getFloatTraitNS(in DOMString namespaceURI, in DOMString name) raises(DOMException); 

Same as getFloatTrait, but for namespaced traits. Parameter name shall be a non-qualified trait name, i.e. without 
prefix. 

Parameters: 

namespaceURI - the namespaceURI of the trait to retrieve. 
name - the name of the trait to retrieve. 

Return Value: 

the trait value as float. 

Exceptions: 

DOMException - with error code NOT_SUPPORTED_ERR if the requested trait is not supported on this element 
or null. 

DOMException - with error code TYPE_MISMATCH_ERR if the requested trait's computed value cannot be 
converted to a float. 

void setFloatTraitNS(in DOMString namespaceURI, in DOMString name, in float value) 
raises(DOMException); 

Same as setFloatTrait, but for namespaced traits. Parameter name shall be a non-qualified trait name, i.e. without 
prefix. 

Parameters: 

namespaceURI - the namespaceURI of the trait to be set. 
name - the name of the trait to be set. 
value - the value of the trait to be set as float. 

Exceptions: 

DOMException - with error code NOT_SUPPORTED_ERR if the requested trait is not supported on this element 
or null. 

DOMException - with error code TYPE_MISMATCH_ERR if the requested trait's value cannot be specified as a 
float (e.g. NaN). 

DOMException - with error code INVALID_ACCESS_ERR if the input value is an invalid value for the given 
trait or null. 



7 Transport 

7.1 Overview 

The transport mechanisms support rich media delivery in the following modes: Unicast download (HTTP/TCP [12] or 
MMS [6] protocol), broadcast/multicast download (FLUTE/UDP [24]), unicast streaming and broadcast/multicast 
streaming (RTP/UDP [15]). For download mode, reliability is guaranteed by existing mechanisms in the transport and 
network layers, and no error resilience tools need to be designed at the application layer for rich media delivery. 
However, rich media transport in streaming mode is more challenging, with UDP being unreliable. Therefore, the RTP 
design provides some error resilience tools to help the media decoder cope with unreliable transport. 

Rich media is a combination of continuous media and discrete media and relevant transport mechanisms for these two 
media types should be used. Rich media streaming is thus naturally realized by: 

a) streaming continuous media such as scene streams, video and audio; and 

b) downloading the discrete media, such as images. 

DIMS Units can be classified as either used in normal processing, or used only for 'redundant' processing. For a given 
DIMS data-stream, these two kinds of DIMS Units can be managed either: 

a) in a single transport; or 

b) in two separate transports. 

7.2 Storage in ISO Base Media File Format Files 

7.2.1 Introduction 

DIMS streams, both primary streams (those containing SVG scenes) and secondary streams (which normally carry only 
updates) are carried in files of the ISO Base Media File Format [10] (including 3GP files [8]) according to this 
subclause. 

Either one or two tracks are used in the file for the normal and redundant DIMS Units. 

7.2.2 Stream Type 

Scenes are carried in scene tracks in ISO family files. They therefore use: 

(a) a video media header 'vmhd'; 

(b) a media handler type of 'sdsm' (scene description media handler); 

(c) a derivative of the base SampleEntry in the sample description box. 

The timescale for the stream should be suitably chosen to achieve the desired accuracy of timing. 

7.2.3 Track and Media Header fields 

The width and height in the track header shall be set in the desired ratio, and indicate the suggested minimum display 
size. A player on a system with an indefinitely large display, in the absence of a fullscreen request, could use this size as 
a suggested initial display size. 

If the presentation has an expected, reasonable duration, then it is encoded as the track duration. Otherwise the ISO file 
format recommendation of maxint for the duration, when it is indeterminate, should be used. 

The language code of the track should be set appropriately if the presentation is language-specific, or else the value 
'und' (undetermined) or 'mul' (multiple) should be used. 

7.2.4 Sample Dependency Table 

The sample dependency table may be used. The 'unknown' field values may be needed under some circumstances. The 
fields have the following semantics for DIMS streams: 

sample_depends_on should be set according to whether the sample contains a normal DIMS Unit (not 
is-redundant) with is-RAP set to 1: 

0: unknown; 

1: this sample does not contain a normal RAP; 

2: this sample does contain a normal RAP; 

3: reserved. 

sample_is_depended_on should be set according to the value of the P-bit in the DIMS Unit headers: 

0: unknown; 

1: one or more DIMS Units have the P-bit set to 1; 

2: no DIMS Unit has the P-bit set to 1 (low-priority sample); 

3: reserved. 

sample_has_redundancy should be set to indicate whether the sample contains redundant DIMS Units: 

0: unknown; 

1: one or more DIMS Units have the is-redundant bit set to 1; 

2: no DIMS Unit has the is-redundant bit set to 1; 

3: reserved. 
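
The mapping above can be sketched as follows for a writer that knows the flags of every DIMS Unit in a sample. The `UnitFlags` type and the function name are hypothetical; the value 0 ('unknown') is never produced here because the sketch assumes all flags are available.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass
class UnitFlags:
    is_rap: bool        # is-RAP flag of the DIMS Unit header
    is_redundant: bool  # is-redundant flag of the DIMS Unit header
    priority: bool      # P-bit of the DIMS Unit header


def sample_dependency_fields(units: Iterable[UnitFlags]) -> Tuple[int, int, int]:
    """Return (sample_depends_on, sample_is_depended_on, sample_has_redundancy)."""
    units = list(units)
    has_normal_rap = any(u.is_rap and not u.is_redundant for u in units)
    has_priority = any(u.priority for u in units)
    has_redundant = any(u.is_redundant for u in units)
    sample_depends_on = 2 if has_normal_rap else 1
    sample_is_depended_on = 1 if has_priority else 2
    sample_has_redundancy = 1 if has_redundant else 2
    return sample_depends_on, sample_is_depended_on, sample_has_redundancy
```

A sample holding a single high-priority normal RAP yields (2, 1, 2); a sample holding only a low-priority redundant unit that is not a RAP yields (1, 2, 1).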

7.2.5 Sample Entry Name and Format 

The sample entry four-character code for scenes is 'dims'. The configuration box shall be present in the sample entry. 

class SceneConfiguration extends FullBox('dimC') { 
    unsigned int(8) profile; 
    unsigned int(8) level; 
    unsigned int(4) pathComponents; 
    unsigned int(1) useFullRequestHost; 
    unsigned int(1) stream_type; 
    unsigned int(2) contains_redundant; 
    string text_encoding; 
    string content_script_types; 
    string content-coding; 
} 

class MPEG4BitRateBox extends Box('btrt') { 
    unsigned int(32) bufferSizeDB; 
    unsigned int(32) maxBitrate; 
    unsigned int(32) avgBitrate; 
} 

class DIMSSampleEntry() extends SampleEntry('dims') { 
    SceneConfiguration config;      // mandatory 
    MPEG4BitRateBox bitrateinfo;    // optional 
} 

The fields have the following semantics: 

• stream_type - takes the value 1 for primary streams, and the value 0 for secondary streams. Files containing 
secondary streams are not normally playable by themselves, outside the context of the scene(s) they are designed 
to update. 

• contains_redundant - takes the value 1 if the stream contains only DIMS Units with is-redundant set to 0, 
the value 2 if the stream contains only DIMS Units with is-redundant set to 1, and the value 3 if both occur. 
The value 0 is reserved. Note that streams containing only redundant units must be linked to the normal stream 
for which they are redundant (see subclause 7.2.9). 

• text_encoding - is a null terminated string with possible values taken from the XML specification for character 
encoding in entities (e.g. subclause 4.3.3 of XML 1.0 Fourth edition [25]). It describes the text encoding after the 
content has been de-compressed (e.g. after deflating). 

• content-coding - this field provides the identification of the compression scheme. It is a null terminated 
string specifying the encoding (compression) format of the content. It is defined in the same way as the 
content-coding header in HTTP (subclause 3.5 of [12]). 

• content_script_types - is a null terminated string that identifies the scripting languages used. It is a 
comma-separated list of MIME types [18] from the IANA registry, such as "application/ecmascript" (see [13]). It 
shall provide a complete listing of the script types used in the stream. 

• bufferSizeDB gives the size of the decoding buffer for the elementary stream in bytes. This is the size of the 
largest buffer needed to hold a sample in textual format, in bytes (i.e. after any de-compression). 

• maxBitrate gives the maximum rate in bits/second over any window of one second. 

• avgBitrate gives the average rate in bits/second over the entire presentation. 

• useFullRequestHost and pathComponents are defined in subclause 5.5.2. 

The text_encoding is required to be consistent over all the DIMS units described by this sample entry. This simplifies 
processing. It is an error to have a mismatch between this value and those present in the XML of the DIMS units 
themselves. 
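
A reader for the configuration fields might look like the sketch below. It makes several assumptions that are not stated in this subclause: the input starts after the box size, type, and the FullBox version and flags; bit fields are packed most significant bit first, as is usual for the ISO syntax description language; and the strings are decoded as UTF-8. All Python names are hypothetical.

```python
from typing import NamedTuple, Tuple


class SceneConfiguration(NamedTuple):
    profile: int
    level: int
    path_components: int
    use_full_request_host: int
    stream_type: int
    contains_redundant: int
    text_encoding: str
    content_script_types: str
    content_coding: str


def _read_cstring(buf: bytes, pos: int) -> Tuple[str, int]:
    """Read a null terminated string at pos; return it and the next position."""
    end = buf.index(b"\x00", pos)  # raises ValueError if the terminator is missing
    return buf[pos:end].decode("utf-8"), end + 1


def parse_scene_configuration(body: bytes) -> SceneConfiguration:
    """Parse the 'dimC' fields that follow the FullBox version and flags."""
    if len(body) < 3:
        raise ValueError("configuration too short")
    profile, level, packed = body[0], body[1], body[2]
    pos = 3
    text_encoding, pos = _read_cstring(body, pos)
    script_types, pos = _read_cstring(body, pos)
    content_coding, pos = _read_cstring(body, pos)
    return SceneConfiguration(
        profile=profile,
        level=level,
        path_components=packed >> 4,             # 4 bits
        use_full_request_host=(packed >> 3) & 1,  # 1 bit
        stream_type=(packed >> 2) & 1,            # 1 bit
        contains_redundant=packed & 3,            # 2 bits
        text_encoding=text_encoding,
        content_script_types=script_types,
        content_coding=content_coding,
    )
```

The three strings are read in the order in which they appear in the syntax listing, not in the order of the semantics list.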

7.2.6 Sample Format 

A sample is a concatenated sequence of one or more DIMS Units associated with the same media time, with a two-byte 
length field in network (big-endian) format preceding each DIMS Unit. The length is the length of the DIMS Unit not 
including the length field itself (that is, the combined length of the DIMS Unit Body and DIMS Unit Header). 

A sample may contain one or more Normal DIMS Units or Redundant DIMS Units, or both, associated with the same 
media time. 
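
Splitting a sample into its DIMS Units follows directly from this layout. The sketch below uses a hypothetical function name and returns each unit (header and body together) as a byte string.

```python
import struct
from typing import List


def split_sample(sample: bytes) -> List[bytes]:
    """Split a sample into DIMS Units using the two-byte big-endian length prefixes."""
    units = []
    pos = 0
    while pos < len(sample):
        if pos + 2 > len(sample):
            raise ValueError("truncated length field")
        (length,) = struct.unpack_from(">H", sample, pos)
        pos += 2  # the length does not include the length field itself
        if pos + length > len(sample):
            raise ValueError("DIMS Unit extends past the end of the sample")
        units.append(sample[pos:pos + length])
        pos += length
    return units
```

Because the length field is two bytes, a single DIMS Unit stored this way cannot exceed 65 535 bytes.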

7.2.7 Other Resources 

Other resources may be carried in the meta-data directories of ISO files, in the track containing the scene, the movie 
containing that track, or the file containing that movie. If there is no actual meta-data (the meta-data block is there 
merely to carry resources), the meta-data handler type 'null' may be used. 

URL forms to address these resources are defined in the ISO specification, and are relative to the file containing the 
resource. 

The meta data box may also be used for multi-scene presentations where the meta box includes the initial SVG scene, 
and one of the tracks provides the updates. 

7.2.8 Sync Samples 

The sync sample table marks samples in which any of the DIMS Units have the is-RAP bit set to 1. 

NOTE: The use of the shadow sync box is deprecated. 

7.2.9 Separate Redundant Track 

Redundant DIMS Units may be stored in the file format using a separate track. The redundant track shall be linked to 
the matching normal track by a track reference of type 'swto' in the redundant track. 

Redundant tracks are identified by this track reference, and shall also have contains_redundant set to "redundant data 
only" in their sample entry. The track they link to shall have contains_redundant set to "normal data only". 

If a stream is converted from a single-track to two-tracks, some small adjustment may be needed. Specifically, any 
'normal' DIMS Units following the redundant-exit indication in the same sample will need to be copied into the 
'redundant' track, marked as 'redundant' DIMS Units, and the redundant-exit indication moved to the last such DIMS 
Unit in the sample. 

A terminal may perform tune-in etc. using the 'redundant' track by: 

a) finding the random access point in the redundant track, closely preceding the desired play point, by using the 
sync sample table; 

b) processing DIMS Units from the redundant track until the redundant-exit indication; 

c) following the 'swto' track reference and commencing processing at the temporally next sample in the linked 
(main) track. 

7.3 RTP Payload format for DIMS Streams 

7.3.1 Priority 

The counter (CTR) field is used to detect the loss of high priority DIMS units. Encoders and decoders keep a running 
value of the counter; the encoder places in each packet the current value of the counter; after being placed in the packet, 
the running counter is incremented by one if that packet contains one or more DIMS Units with high priority. The 
decoder compares the CTR field of each incoming packet with its running counter, and thereby checks for high-priority 
loss. After the check, the decoder's running counter is incremented by one if the received packet contains one or more 
DIMS Units with high priority. 

NOTE: A discontinuity in the sequence number indicates a lost packet. A discontinuity in the CTR field indicates 
the number of prioritized packets which have been lost. 



3GPP TS 26.142 version 7.0.0 Release 7 
ETSI TS 126 142 V7.0.0 (2007-06) 



An example of the use of the CTR and priority (P) bits is shown below: 

        PKT    PKT    PKT (lost)    PKT
P       1      1      0             1
CTR     5      6      7             7

The expected value of CTR after the last received packet was 7, and as the value of CTR 
did not increase during the packet loss it can be established that the lost packet(s) had no 
DIMS Data Units with priority P=1. 

        PKT    PKT    PKT (lost)    PKT
P       1      1      1             0
CTR     5      6      7             0

The expected value of CTR after the last received packet was 7, and as the value of CTR 
increased from 7 to 0 during the packet loss it can be established that a prioritized packet, 
containing one or more high-priority DIMS Data Units, was lost. 

Figure 7-1: Example of prioritization including detection of lost prioritized packets 

Note that loss is only detected on the next packet to arrive; if the content has long periods in which no packets are sent, 
or is otherwise bursty, it may be inadvisable to have a high-priority packet before a long silence interval, as its loss 
cannot be detected until the first packet after that interval, at the earliest. 
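
The decoder side of the counter can be sketched as below. The class and method names are hypothetical, and initialising the running counter from the first received packet is an assumption. Because CTR is 3 bits wide, the count of lost prioritized packets is only known modulo 8.

```python
CTR_MODULUS = 8  # the CTR field is 3 bits wide


class CtrTracker:
    """Tracks the running counter on the decoder side."""

    def __init__(self, initial_ctr: int) -> None:
        self.expected = initial_ctr % CTR_MODULUS

    def on_packet(self, ctr: int, has_high_priority: bool) -> int:
        """Return the number of prioritized packets lost before this one (modulo 8)."""
        lost = (ctr - self.expected) % CTR_MODULUS
        # Resynchronise to the received value, then apply the increment rule.
        self.expected = ctr % CTR_MODULUS
        if has_high_priority:
            self.expected = (self.expected + 1) % CTR_MODULUS
        return lost
```

Fed with the packet sequence P=1/CTR=5, P=1/CTR=6, then P=0/CTR=0 after a loss, the last call reports one lost prioritized packet; a next packet with CTR=7 would report none.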



7.3.2 RTP Packet format 

7.3.2.1 Introduction 



In the context of the present document (specifically the MIME type defined in subclause 11.1), the units carried by the 
RTP Payload Format are DIMS Units. The RTP payload format defines two basic packet structures: 

a) packets containing one or more entire units; 

b) packets containing a single fragment of a unit. 

Depending on the underlying network and the unit size, it may be desirable to split units or aggregate them. 

7.3.2.2 RTP Header Usage 

The RTP header is defined in [15] and its use in this payload format is described below. 

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|V=2|P|X|  CC   |M|     PT      |       sequence number         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                           timestamp                           |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|           synchronization source (SSRC) identifier            |
+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
|            contributing source (CSRC) identifiers             |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Figure 7-2: RTP HEADER 

Marker bit (M): 1 bit - The marker bit is set for the last packet associated with a timestamp. 

NOTE: This is useful when a scene is sent as a combination of a smaller scene and a series of scene commands in 
separate packets. In this case the marker bit of the packet containing the last scene command is to be set. 
This is in line with the normal use of the marker bit in video coding and enables efficient buffering. 

Timestamp: 32 bits - The timestamp indicates the rendering instant of the unit(s). 

The usage of the remaining RTP header fields follows the rules of [15]. 

7.3.2.3 Common Packet Header 

The RTP payload begins with a common header, which has the following format: 

+-+-+-+-+-+-+-+-+
|0|1|2|3|4|5|6|7|
+-+-+-+-+-+-+-+-+
|R|A|  T  | CTR |
+-+-+-+-+-+-+-+-+

Figure 7-3: COMMON PAYLOAD HEADER 

R: 1 bit 

The R bit is reserved, shall be set to 0, and shall be ignored by the receiver. 

A: 1 bit 

When set to one, the A bit indicates that the packet contains one or more random access points (in DIMS, DIMS 
Units with is-RAP set), or the first fragment of a random access point. 

T: 3 bits 

The payload type, as defined in table 2. Reserved values shall not be used, and packets with reserved values of the 
type field shall be discarded and not processed. 

Table 2: Summary of RTP Payload Types and Descriptions 

Type    Description
------  -------------------------------
0       Aggregation packet
1       Fragmentation start Packet
2       Fragmentation continuing Packet
3       Fragmentation end Packet
4 to 7  Reserved

CTR: 3 bits 

The CTR is used to detect the loss of one or more high-priority units as documented in subclause 7.3.1. 
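
Decoding the one-octet common header is a matter of shifts and masks, with bit 0 of the figure taken as the most significant bit. The names in this sketch are hypothetical; raising an exception is simply one way of signalling that a packet with a reserved type is to be discarded.

```python
from typing import NamedTuple


class CommonHeader(NamedTuple):
    a: bool            # packet carries (the start of) a random access point
    payload_type: int  # T field, 0 to 3
    ctr: int           # CTR field, 0 to 7


def parse_common_header(octet: int) -> CommonHeader:
    """Decode the common payload header from the first payload octet."""
    if not 0 <= octet <= 0xFF:
        raise ValueError("header must be a single octet")
    # Bit 0 (0x80) is R: reserved, ignored by the receiver.
    a = bool(octet & 0x40)
    payload_type = (octet >> 3) & 0x07
    ctr = octet & 0x07
    if payload_type > 3:
        raise ValueError("reserved payload type: packet is discarded")
    return CommonHeader(a, payload_type, ctr)
```

For example, the octet 0x4D decodes to A set, type 1 (fragmentation start) and CTR 5.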

7.3.2.4 Aggregation Packet 

These packets contain one or more complete units with the same timestamp. The common header values are: 

• Type: 0. 

• A (RAP): as needed. 

The RTP payload is presented below. 

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|Header (Type=0)|       first Unit length       |               :
+-+-+-+-+-+-+-+-+                     unit                      |
:                               +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                               |...OPTIONAL RTP padding        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Figure 7-4: Aggregation Packet payload format 

The units are placed in the RTP payload, in sequence, possibly followed by RTP padding. Each unit is preceded by a 
two-byte length in network (big-endian) byte order. The length is the length of the following unit (both header and 
body), not including the length field itself. 

7.3.2.5 Fragmentation Packets 

Frames that exceed the network's maximum transmission unit (MTU) need to be fragmented before transmission. By 
fragmenting at the RTP level one need not rely on lower layer fragmentation, e.g. IP. 

The payload format defines fragmentation of units into two or more RTP packets. 

NOTE: Fragmentation on the RTP level should however be seen as a solution only when fragmentation on the 
DIMS level is not possible. Fragmentation can be performed by splitting, for example, a scene into a 
scene and a number of scene updates. In this way packets can be created that are smaller than MTUs and 
can be decoded individually, which gives better error resilience when packets are lost. 

The common header values are as follows. 

• Type: 1, 2, or 3. 

• A (RAP): as needed in first fragment, and 0 in all other fragments. 

• CTR: shall be identical in all the packets of a fragmented unit; increments after the last fragment depending on 
the priority of the unit. 

Fragments consist of an integer number of consecutive octets of a unit. Fragments of a unit shall be sent as a group and 
in consecutive order with respect to RTP sequence numbers. The first fragment shall be marked as type 1 and the last 
fragment shall be marked as type 3. Other fragments shall be marked as type 2. 

The unit is complete with header. The header is not repeated in fragments. There is no length field. 

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|Header (type=1)|    Header     |                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+                               |
|                                                               |
|                     Partial Unit payload                      |
|                                                               |
|                               +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                               :...OPTIONAL RTP padding        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|Hdr (type=2/3) |                                               |
+-+-+-+-+-+-+-+-+                                               |
|                                                               |
|                     Partial Unit payload                      |
|                                                               |
|                               +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                               :...OPTIONAL RTP padding        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Figure 7-5: FRAGMENTATION PACKET FORMATS 
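
A sender-side sketch of the fragmentation rules follows. The names are hypothetical, `max_fragment` is the number of unit octets carried per packet (excluding the one-octet common header), and the A bit is assumed to be cleared in every fragment other than the first. RTP padding is not modelled.

```python
from typing import List

FRAG_START, FRAG_CONTINUE, FRAG_END = 1, 2, 3


def fragment_unit(unit: bytes, max_fragment: int, is_rap: bool, ctr: int) -> List[bytes]:
    """Split one complete DIMS Unit (header included) into fragment payloads."""
    if max_fragment < 1:
        raise ValueError("max_fragment must be at least 1")
    chunks = [unit[i:i + max_fragment] for i in range(0, len(unit), max_fragment)]
    if len(chunks) < 2:
        raise ValueError("unit fits in one packet; use an aggregation packet")
    payloads = []
    for index, chunk in enumerate(chunks):
        if index == 0:
            payload_type, a_bit = FRAG_START, int(is_rap)
        elif index == len(chunks) - 1:
            payload_type, a_bit = FRAG_END, 0
        else:
            payload_type, a_bit = FRAG_CONTINUE, 0
        # Common header: R=0, A, T (3 bits), CTR (3 bits, same in all fragments).
        header = (a_bit << 6) | (payload_type << 3) | (ctr & 0x07)
        payloads.append(bytes([header]) + chunk)
    return payloads
```

The receiver reassembles the unit by concatenating the payloads, minus the first octet of each, in RTP sequence number order from the type 1 packet to the type 3 packet.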

7.3.3 SDP Parameters 

The Session Description specifies the clock rate, version profile and level. The fields in the Session Description 
Protocol (SDP) are defined as follows: 

• The media name in the "m=" line of SDP shall be video. 

• The encoding name in the "a=rtpmap" line of SDP shall be richmedia+xml. 

The clock rate in the "a=rtpmap" line is not specified in the present document. The resolution of the clock should be 
sufficient for the desired synchronization accuracy and for measuring packet arrival jitter. The clock rate of the 
referenced continuous media files within the presentation needs to be considered. For example, if the presentation 
contains referenced video which is to be synchronized with the presentation, the clock rate should be no less than 
90,000. 

The MIME parameters in subclause 11.1, when present, shall be included in the "a=fmtp" line of SDP. These 
parameters are expressed as a MIME media type string, in the form of a semicolon separated list of parameter=value 
pairs. 

An example of a media-level description in SDP format is shown below. 

m=video 12345 RTP/AVP 96 
a=rtpmap:96 richmedia+xml/100000 
a=fmtp:96 Version-profile=10; Level=20; 
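
A receiver could read the parameter list of an "a=fmtp" line as sketched below. The function name is hypothetical, and folding parameter names to lower case and stripping surrounding double quotes from values are assumptions made for convenience, not requirements of the present document.

```python
from typing import Dict, Tuple


def parse_fmtp(line: str) -> Tuple[int, Dict[str, str]]:
    """Return the payload type number and the parameter=value pairs of an a=fmtp line."""
    prefix = "a=fmtp:"
    if not line.startswith(prefix):
        raise ValueError("not an a=fmtp line")
    payload_type, _, params = line[len(prefix):].strip().partition(" ")
    result: Dict[str, str] = {}
    for item in params.split(";"):
        item = item.strip()
        if not item:
            continue  # tolerate a trailing semicolon
        name, _, value = item.partition("=")
        result[name.strip().lower()] = value.strip().strip('"')
    return int(payload_type), result
```

Applied to the fmtp line of the example above, this gives payload type 96 with version-profile "10" and level "20".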



7.3.4 Separate Redundant Stream 



Redundant DIMS Units may be carried in RTP in a separate stream. The redundant stream shall be linked to the 
matching normal stream. This is done using the media identification and group attributes as specified in [22]. Both the 
stream containing main DIMS Units, and the stream containing redundant DIMS Units shall have a 'mid' (media 
identification) attribute, and they shall be placed in a 'group' attribute of which the type is ffs. 

Redundant streams have contains-redundant set to "redundant data only". The stream they are connected to shall have 
contains-redundant set to "normal data only". 

A terminal may perform tune-in etc. using the 'redundant' stream by: 

a) looking for a random access point in the redundant stream, or main stream; 

b) if the random access point was in the redundant stream, processing DIMS Units from the redundant stream until 
the redundant-exit indication; 

c) continuing processing at the temporally next DIMS Unit in the normal stream. 

An example might be: 

v=0 
o=adam 289083124 289083124 IN IP4 host.example.com 
t=0 
c=IN IP4 131.160.1.112 
a=group:group_name 1 2 
m=video 30000 RTP/AVP 97 
a=rtpmap:97 richmedia+xml/90000 
a=fmtp:97 version-profile=10; level=10; contains-redundant="normal" 
a=mid:1 
m=video 30002 RTP/AVP 98 
a=rtpmap:98 richmedia+xml/90000 
a=fmtp:98 version-profile=10; level=10; contains-redundant="redundant" 
a=mid:2 

The group name and definition is ffs. 



8 Profiles and Levels 

8.1 Profiles 

8.1.1 Introduction 

A profile indicator in a stream indicates which features (also known as tools) are required to be supported on a terminal. 

Profile indications are 8-bit integers. Only one profile is defined by the present document; other profiles may be defined 
in future. 

8.1.2 Mobile profile 

Mobile Profile: Profile Indicator Value 10. 

Support for the following media types is also required in profile 10: 

• Support for images. 

• Support for embedded audio in 3GP and AMR files. 

• Support for embedded video in 3GP files. 

As required in the SVG specification, SVG fonts shall be supported. The lack of hinting in SVG fonts means that small 
text which is anti-aliased may become unreadable. This problem is even more evident when text is rotated or animated. 
Recommendation: SVG fonts should be used with care. 

The Open Font Format [4] should be supported at advanced simple text profile, level 2, with the following constraints: 
if OpenType fonts are supported, the DIMS client shall support downloadable OpenType fonts with TrueType outlines, 
TrueType hinting shall be supported for improved text readability, and advanced typographic features may be 
supported. 

NOTE 1: When OpenType fonts are supported, download of them may be initiated using the font-face-uri element 
from [1]. 

Device-native fonts and fonts identified by generic family names may be used. 

Uncompressed XML shall be supported. XML compressed with GZIP [11] shall be supported. 

NOTE 2: Sub-systems may also require support for other media types (e.g. video) or codecs within those types 
(e.g. H.263) when support for DIMS is required. 

8.2 Levels 

8.2.1 Introduction 

Level indicators provide a way to measure the degree of support required in a terminal to render a given scene or scene 
stream satisfactorily. 

The following level constraints are to be respected by the content. DIMS Implementations will be able to use the level 
indicator to optimize the rendering of the content. 

8.2.2 Level Axes 

This subclause is ffs. 

Levels are measured on the following axes: 

1) Bitrate of the scene stream, including the initial scene, embedded graphics, audio, video, etc. (That is, the 
minimum bit-rate channel over which the scene could be delivered in a real-time fashion). 

2) Overall memory requirements, including DOM tree, buffers, animation values, scripts and their state including 
variables, and so on. The size of the DOM tree is measured by the number of nodes in the tree; the number of 
attributes, or the size of their values, is not calculated. 

3) Required frame rate for animations. 

4) The maximum number of simultaneously playing video streams. 

5) The maximum number of simultaneously playing audio streams. 

6) The maximum number of simultaneously active DIMS Scenes. 

7) The maximum number of animations that run concurrently. 

8) The minimum screen space needed to display the scene. 

9) The maximum permitted update rate, averaged over any period of 1 second. 

The following subclauses define the available levels. 

8.2.3 Level 10 definition 

This subclause is ffs. 

This level contains the following restrictions: 

• Only one instantiation of a DIMS Scene is allowed. 

• Only one Video instantiation along with a DIMS Scene is allowed. 

On the Video Element, the attribute transformBehavior shall be restricted to values "pinned | pinned90 | pinned180 | 
pinned270", and the attribute overlay shall be restricted to values "top". 

The following limits also apply: 



Level   Rate         DOM nodes   Frame rate   #Video   #Audio   #Anims   Screen size 
10      256 kbit/s   300         15           1        1        10       160x120 

8.2.4 Level 100 definition 

This high-end level does not contain any restrictions. 
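
As a non-normative illustration, the Python sketch below shows how an authoring tool might check measured scene properties against the Level 10 limits in the table above and choose a level indicator. The class `SceneMetrics`, its field names and both functions are hypothetical. Treating every limit as an upper bound, including the screen size, is an assumption of this sketch.

```python
from dataclasses import dataclass, fields


@dataclass
class SceneMetrics:
    bitrate_kbps: float
    dom_nodes: int
    frame_rate: float
    video_streams: int
    audio_streams: int
    animations: int
    width: int
    height: int


# Limits copied from the Level 10 table: 256 kbit/s, 300 DOM nodes, 15 fps,
# 1 video, 1 audio, 10 animations, 160x120 screen.
LEVEL_10_LIMITS = SceneMetrics(256, 300, 15, 1, 1, 10, 160, 120)


def level_10_violations(metrics: SceneMetrics) -> list[str]:
    """Return the names of all axes on which the content exceeds Level 10."""
    return [
        f.name
        for f in fields(SceneMetrics)
        if getattr(metrics, f.name) > getattr(LEVEL_10_LIMITS, f.name)
    ]


def minimum_level(metrics: SceneMetrics) -> int:
    """Return 10 if the content fits Level 10, otherwise the unrestricted level 100."""
    return 10 if not level_10_violations(metrics) else 100


small = SceneMetrics(128, 250, 15, 1, 1, 4, 160, 120)
large = SceneMetrics(300, 250, 15, 2, 1, 4, 160, 120)
print(minimum_level(small), level_10_violations(large))
```

The sketch does not cover the structural restrictions of Level 10 (single scene instantiation, the transformBehavior and overlay values), which would need a separate inspection of the scene itself.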

9 Content usage guidelines 

The SVGT1.2 guidelines defined in Annex L of PSS release 7 [7] apply here. 

NOTE: The recommendation of L.2.8 should be read as including downloadable fonts: "Usage of device or 
system fonts, or downloaded OpenType fonts, is recommended. SVG fonts should be used with care." 

Content creators should take into account the DIMS levels definitions. 

10 Security and Content Protection Considerations 

DIMS does not define a security framework. DIMS relies instead on the security frameworks already defined for the 
mechanisms DIMS uses (e.g. for ECMAScript), and the frameworks provided by the platforms on which DIMS runs. 

When content requests fullscreen video and especially fullscreen scenes, it is possible for the content to mimic the 
normal look of the device (the 'desktop' of a computer screen, for example) and persuade the user to enter potentially 
secure or private information into a presentation while thinking that they are interacting with the local system. This is 
sometimes called "phishing". Care should be taken to handle content that uses fullscreen requests, such that the user is 
always aware of when DIMS content is filling the screen. 

DIMS content can embed scripts. Care should be taken to limit the access that these scripts have to the presentation 
in which they occur. For example, it would normally be inappropriate for these scripts to have access to the local file 
system outside the scope of the presentation. 

Authors of web sites that embed DIMS content should exercise caution when the scripts in that content are not under 
the control of the web site, for example if the DIMS content is fetched from another site or uploaded to the web site 
by users. The embedded scripts may have access to the content of, and interaction of, the web site that embeds 
them, even though they were not authored by, or provided from, that web site. 



11 Registered Types 



11.1 RTP Payload format MIME Type 

Type name: video 

Subtype name: richmedia+xml 

Required parameters: 

• Version-profile - Specifies the profile of DIMS used, for example the value indicating Mobile Profile. 

• Level - Specifies the minimum DIMS level needed to be able to display the scene. 

Optional parameters: 

• stream-type - takes the value "primary" for primary streams (in which every random access point is a scene), 
and the value "secondary" for secondary streams. Secondary streams are not normally playable by themselves, 
outside the context of the scene(s) they are designed to update. 

• contains-redundant - takes the value "normal" if the stream contains only DIMS Units with is-redundant set 
to 0, the value "redundant" if the stream contains only DIMS Units with is-redundant set to 1, and the value 
"normal+redundant" if both occur. Note that streams containing only redundant units must be linked to the 
matching stream carrying normal DIMS Units (see subclause 7.3.4). 

• text-encoding - is a string enclosed in double-quotes with possible values taken from the XML specification for 
character encoding in entities (e.g. subclause 4.3.3 of [25]). It describes the text encoding after the content has 
been decompressed (e.g. after deflating). The default value is "UTF-8" [9]. This field is only applicable if the 
content is transmitted as (possibly encoded) text. 

• content-script-types - is a string enclosed in double-quotes that identifies the scripting languages used. It 
is formatted as a comma-separated list of MIME types [18] from the IANA registry, such as 
"application/ecmascript" (see [13]), or the empty string, indicating no scripting is used. The default value is the 
empty string. It shall provide a complete listing of the script types used in the stream. 

• content-coding - this field provides the identification of the compression scheme. It is a string specifying the 
encoding (compression) format of the content. It is defined in the same way as the content-coding header in 
HTTP (subclause 3.5 of [12]). 

• useFullRequestHost - takes the value "0" or "1"; the definition of this parameter is in subclause 5.5.2. 

• pathComponents - takes a value between "0" and "15"; the definition of this parameter is in subclause 5.5.2. 

Encoding considerations: 

• This media type is currently only defined for transport via RTP. 

Security considerations: 

• RTP packets using the payload format defined in the present document are subject to the security considerations 
discussed in the RTP specification [15] and any applicable RTP profile, e.g., AVP [21]. 

Interoperability considerations: 

• None. 

Published specification: 

• 3GPP TS 26.142. 

Applications that use this media type: 

• DIMS Streaming applications. 

Additional information: 

• Magic number(s): None. 

• File extension(s): None. 

• Macintosh file type code(s): None. 

Person and email address to contact for further information: 

• Clinton Priddle. 

• clinton.priddle@ericsson.com. 

• Multimedia Technologies, Ericsson. 

Intended usage: 

• COMMON. 

Restrictions on usage: 

• None. 

Author: 

• 3GPP SA4 WG. 

Change controller: 

• 3GPP TSG SA. 

11.2 'Codecs' Parameter for 3GP files 

When DIMS content is supplied in 3GP files which are identified by MIME type, the 'codecs' parameter defined in [14] 
may be used to indicate that DIMS content is present. The codecs parameter takes the sample entry name as defined 
above (that is, 'dims'). 

Annex A (normative): 
Conformance Criteria 



DIMS constructs scenes which are possibly updated over time. Conformant terminals shall support the delivery of the 
scenes and updates in formats specified in the 'compression' subclause above, and in the transport environments 
specified in the 'transport' subclause. 

For the initial scene, a DIMS Scene can be extracted from the transport, and de-compressed if necessary, yielding an 
XML document. This XML is referred to here as the "initial DIMS document". Similarly, after all updates for a given 
instant have been applied to the scene tree, there is logically an XML document that is equivalent to the scene DOM 
tree; these are called "subsequent DIMS documents" here. 

Initial and subsequent DIMS documents shall conform to all of: 

• the conformance requirements in Appendix D of [1]; with the following exceptions: 

The conformance criteria in the SVG specification regarding codecs do not apply for the DIMS media type. 
Clause D.4 is not in scope for DIMS. 
Clause D.7 is not in scope for DIMS. 

• the conformance requirements of the LASeR Commands and LASeR scene extensions as specified in 
ISO/IEC 14496-4/AMD20 (LASeR Conformance); 

• the limitations of the profile and level indications under which they are delivered. 

Conformance of the DIMS extensions is for further study. 

Annex B (informative): 
Change history 


Date      TSG#   TSG Doc.    CR   Rev   Subject/Comment             Old   New 
2007-06   36     SP-070312              Approved at SA#36 Plenary         7.0.0 


3GPP TS 26.142 version 7.0.0 Release 7 




ETSI TS 126 142 V7.0.0 (2007-06) 



History 



Document history 

V7.0.0   June 2007   Publication 
